Privacy Preserving Clustering by Data Transformation

Authors

  • Stanley R. M. Oliveira
  • Osmar R. Zaïane
Abstract

Despite their benefits in a wide range of applications, data mining techniques have also raised a number of ethical issues, including privacy, data security, and intellectual property rights. In this paper, we address the privacy problem of unauthorized secondary use of information. To do so, we introduce a family of geometric data transformation methods (GDTMs) which ensure that the mining process will not violate privacy up to a certain degree of security. We focus primarily on privacy preserving data clustering, notably on partition-based and hierarchical methods. Our proposed methods distort only confidential numerical attributes to meet privacy requirements, while preserving general features for clustering analysis. Our experiments demonstrate that our methods are effective and provide acceptable values in practice for balancing privacy and accuracy. We report the main results of our performance evaluation and discuss some open research issues.
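As a rough illustration of the idea in the abstract, the sketch below distorts only the columns marked confidential and leaves the others untouched. It is not the authors' implementation: the function names `distort` and `euclidean`, the noise values, and the sample rows are all hypothetical, and only translation and scaling are shown because those are the transformations named in the related abstract listed on this page.

```python
import math


def distort(rows, confidential, op, noise):
    """Return a copy of rows with only the confidential columns distorted.

    rows         -- list of numeric records (each a list)
    confidential -- indexes of the columns to distort (assumption: chosen by the data owner)
    op           -- "translate" (add the noise term) or "scale" (multiply by it)
    noise        -- dict mapping column index to its noise term (hypothetical values)
    """
    if op not in ("translate", "scale"):
        raise ValueError("op must be 'translate' or 'scale'")
    out = [list(r) for r in rows]  # copy so the original data is not modified
    for col in confidential:
        for r in out:
            if op == "translate":
                r[col] = r[col] + noise[col]
            else:
                r[col] = r[col] * noise[col]
    return out


def euclidean(a, b):
    """Euclidean distance between two records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: column 0 is age (public), column 1 is salary (confidential).
rows = [[25, 50000.0], [40, 82000.0], [31, 61000.0]]
translated = distort(rows, confidential=[1], op="translate", noise={1: -1500.0})
scaled = distort(rows, confidential=[1], op="scale", noise={1: 0.5})
```

Translating a column by a constant leaves every pairwise distance unchanged, so distance-based clustering sees the same structure, while scaling stretches or shrinks distances along the distorted attribute. This is one way to picture the balance between privacy and accuracy that the abstract refers to.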

Similar resources

Revisiting "Privacy Preserving Clustering by Data Transformation"

Preserving the privacy of individuals when data are shared for clustering is a complex problem. The challenge is how to protect the underlying data values subjected to clustering without jeopardizing the similarity between objects under analysis. In this short paper, we revisit a family of geometric data transformation methods (GDTMs) that distort numerical attributes by translations, scalings,...

A Model Based Framework for Privacy Preserving Clustering Using SOM

Privacy has become an important issue in the progress of data mining techniques. Many laws are being enacted in various countries to protect the privacy of data. This privacy concern has been addressed by developing data mining techniques under a framework called privacy preserving data mining. Presently there are two main approaches popularly used: data perturbation and secure multiparty compu...

Privacy Preserving Based on PCA Transformation Using Data Perturbation Technique

Research on maintaining confidentiality, privacy, and security in data mining (PPDM) is one of the biggest trends. Recent advances in data collection, data dissemination, and related technologies have inaugurated a new era of research in which existing data mining algorithms should be reconsidered from a different point of view, that of privacy preservation. We propose a simple PCA based transformation ap...

An Effective Data Transformation Approach for Privacy Preserving Clustering

A new stream of research, privacy preserving data mining, emerged due to recent advances in data mining, Internet, and security technologies. Data sharing among organizations is considered useful because it offers mutual benefit for business growth. Preserving the privacy of shared data for clustering was considered the most challenging problem. To overcome the problem, the data owner publishe...

Privacy Preserving Clustering by Hybrid Data Transformation Approach

Numerous organizations collect and share large amounts of data due to the proliferation of information technologies and the Internet. The information extracted from these databases through the data mining process may reveal private information about individuals. Privacy preserving data mining is a new research area, which allows sharing of privacy-sensitive data for analysis purposes. In this paper a hybri...

Augmented Rotation-Based Transformation for Privacy-Preserving Data Clustering

© 2010 Dowon Hong et al. Multiple rotation-based transformation (MRBT) was introduced recently for mitigating the apriori-knowledge independent component analysis (AK-ICA) attack on rotation-based transformation (RBT), which is used for privacy-preserving data clustering. MRBT is shown to mitigate the AK-ICA attack but at the expense of data utility by not enabling conventional clustering. ...

Journal title:
  • JIDM

Volume 1  Issue 

Pages  -

Publication date 2003